A first inspection of the dataset with 154 variables

importing the data

P = read.table("/Users/macbookpro/Documents/Bayesian Statistics/Project/Raw_data/Peptidi/154 variabili/101_peptidi-PreProcessed-IM-Step1-Step2-Step4-Step5-101.txt")
sum(is.na(P))
## [1] 1603792

the number of NAs is substantial: 1603792 of the 18290 x 154 = 2816660 entries

vis_miss(P,warn_large_data = FALSE)
## Warning: `gather_()` was deprecated in tidyr 1.2.0.
## ℹ Please use `gather()` instead.
## ℹ The deprecated feature was likely used in the visdat package.
##   Please report the issue at <https://github.com/ropensci/visdat/issues>.

about 57% of the entries are missing (1603792 / 2816660 ≈ 0.569)
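
The overall missing fraction can also be computed directly, without the plot. A minimal sketch on a toy data frame (the `toy` object and its values are hypothetical; on the real data the same call would be `mean(is.na(P))`):

```r
# Toy stand-in for P: 4 rows x 3 columns with 5 missing entries
toy <- data.frame(
  a = c(1.2, NA, 0.7, NA),
  b = c(NA, NA, 2.1, 3.3),
  c = c(0.4, 0.9, NA, 1.1)
)

n_missing    <- sum(is.na(toy))        # total number of NA entries
n_entries    <- nrow(toy) * ncol(toy)  # total number of cells
frac_missing <- n_missing / n_entries  # same value as mean(is.na(toy))

frac_missing
```

For P the same arithmetic is 1603792 / (18290 * 154), which is roughly 0.569.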

skim(P)
Data summary
Name P
Number of rows 18290
Number of columns 154
_______________________
Column type frequency:
numeric 154
________________________
Group variables None

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
X703.401372865699 14526 0.21 1.79 0.57 0.34 1.39 1.73 2.15 4.22 ▂▇▅▁▁
X704.39762274031 11096 0.39 1.91 0.60 0.28 1.48 1.86 2.30 4.70 ▂▇▅▁▁
X705.384775176124 3685 0.80 10.73 9.14 0.24 3.78 7.26 15.25 50.30 ▇▃▂▁▁
X706.388721051968 8345 0.54 4.69 3.04 0.38 2.40 3.66 6.22 16.87 ▇▅▂▁▁
X721.385792389616 113 0.99 19.52 19.46 0.26 4.92 11.87 28.30 105.23 ▇▂▁▁▁
X722.388097317932 4233 0.77 7.45 6.94 0.31 2.50 4.72 10.07 37.51 ▇▂▁▁▁
X726.439655161571 7652 0.58 24.48 20.91 0.31 8.71 17.00 34.38 139.30 ▇▂▁▁▁
X738.404675559703 10718 0.41 3.19 1.25 0.26 2.35 3.06 3.93 11.79 ▅▇▁▁▁
X739.41085100008 12131 0.34 2.02 0.69 0.15 1.55 1.96 2.43 5.72 ▂▇▃▁▁
X743.385993639694 2663 0.85 4.50 3.30 0.19 1.92 3.41 6.43 20.39 ▇▃▂▁▁
X764.397713644451 6922 0.62 3.33 1.83 0.14 1.93 2.80 4.40 10.75 ▇▇▃▁▁
X766.477627979316 1534 0.92 2.68 0.99 0.16 1.97 2.59 3.29 10.56 ▅▇▁▁▁
X771.438511022179 13724 0.25 1.99 0.76 0.11 1.42 1.91 2.46 5.28 ▂▇▅▁▁
X795.474297481607 2804 0.85 2.32 0.84 0.27 1.69 2.21 2.84 7.17 ▃▇▂▁▁
X796.463347662883 12714 0.30 1.86 0.53 0.24 1.48 1.83 2.22 3.64 ▁▆▇▃▁
X797.475093209898 10529 0.42 1.92 0.61 0.18 1.48 1.86 2.28 4.54 ▁▇▆▁▁
X811.465860310989 7815 0.57 8.07 4.96 0.23 4.13 6.78 10.99 29.01 ▇▆▃▁▁
X812.462812396293 7692 0.58 4.21 2.48 0.14 2.27 3.60 5.71 14.21 ▇▇▃▁▁
X813.519064384011 10869 0.41 2.54 0.95 0.16 1.91 2.46 3.02 10.95 ▆▇▁▁▁
X816.472991849074 7581 0.59 2.77 0.94 0.16 2.10 2.74 3.41 7.56 ▂▇▅▁▁
X817.420770454515 11493 0.37 2.07 0.73 0.31 1.52 1.99 2.52 9.10 ▇▆▁▁▁
X818.43774134942 12785 0.30 1.44 0.50 0.22 1.09 1.39 1.76 5.17 ▅▇▁▁▁
X833.121888423167 12797 0.30 7.97 3.72 1.07 5.59 7.73 9.77 49.17 ▇▂▁▁▁
X839.132555143816 10988 0.40 5.01 1.68 1.30 4.08 4.78 5.55 28.22 ▇▁▁▁▁
X840.140934584387 13660 0.25 2.84 1.05 0.74 2.31 2.67 3.08 16.54 ▇▁▁▁▁
X841.129972137687 14242 0.22 1.83 1.11 0.18 1.31 1.66 1.99 28.23 ▇▁▁▁▁
X842.550729203612 9673 0.47 21.18 16.58 1.41 9.96 16.91 27.37 210.52 ▇▁▁▁▁
X843.552178547911 8940 0.51 10.29 6.04 0.91 5.97 9.11 13.15 70.12 ▇▂▁▁▁
X844.543168992869 8010 0.56 3.59 2.06 0.37 2.17 3.17 4.46 26.27 ▇▁▁▁▁
X855.100759090494 8216 0.55 14.52 4.84 1.30 11.21 13.94 17.07 60.01 ▆▇▁▁▁
X856.103634808692 8074 0.56 6.43 2.11 0.91 4.99 6.16 7.53 27.00 ▇▇▁▁▁
X857.103870843625 13083 0.28 2.76 0.85 0.76 2.21 2.65 3.13 11.27 ▇▅▁▁▁
X859.544439470389 11332 0.38 22.93 18.04 0.46 8.59 18.53 32.72 107.98 ▇▅▂▁▁
X860.532361434122 8410 0.54 8.53 7.24 0.34 3.23 6.10 11.62 47.71 ▇▂▁▁▁
X871.076852631064 12276 0.33 4.11 1.55 1.02 3.07 3.85 4.82 16.17 ▇▆▁▁▁
X872.078608509269 14095 0.23 1.99 0.71 0.33 1.55 1.89 2.29 7.29 ▅▇▁▁▁
X873.547616791839 7290 0.60 5.08 4.16 0.18 1.97 3.70 6.97 26.47 ▇▃▁▁▁
X874.538835593762 8075 0.56 2.80 1.84 0.25 1.40 2.24 3.74 12.24 ▇▅▂▁▁
X877.098035665664 12588 0.31 5.13 2.54 0.69 3.44 4.65 6.27 48.46 ▇▁▁▁▁
X878.099308962539 14472 0.21 2.36 1.23 0.23 1.55 2.07 2.89 20.73 ▇▁▁▁▁
X881.52287297445 8765 0.52 3.78 2.45 0.11 1.93 3.11 5.04 17.61 ▇▅▁▁▁
X899.512111937031 769 0.96 2.14 0.83 0.19 1.54 2.01 2.65 6.03 ▂▇▃▁▁
X900.519132680702 6240 0.66 1.55 0.54 0.27 1.15 1.48 1.89 3.95 ▂▇▅▁▁
X901.534339846564 7782 0.57 1.36 0.49 0.26 1.01 1.30 1.65 5.52 ▇▇▁▁▁
X913.486722997486 13637 0.25 2.04 2.72 0.32 1.19 1.51 1.89 46.21 ▇▁▁▁▁
X919.505647165412 14382 0.21 1.00 0.30 0.15 0.78 0.96 1.20 2.11 ▁▇▇▂▁
X929.552854874804 8384 0.54 1.18 0.38 0.24 0.91 1.14 1.41 3.36 ▃▇▂▁▁
X930.570286785265 8013 0.56 1.21 0.65 0.15 0.80 1.05 1.42 6.22 ▇▂▁▁▁
X944.531705809784 345 0.98 4.92 5.20 0.23 1.85 2.92 5.67 36.76 ▇▁▁▁▁
X945.538617455054 9510 0.48 3.36 2.61 0.26 1.48 2.46 4.39 16.87 ▇▃▁▁▁
X951.534369302263 14436 0.21 1.39 0.47 0.09 1.03 1.35 1.72 3.29 ▁▇▇▂▁
X966.520332432519 13523 0.26 1.86 0.78 0.17 1.29 1.75 2.31 4.95 ▃▇▅▁▁
X968.539465096924 1469 0.92 2.28 0.98 0.16 1.56 2.09 2.85 9.16 ▇▇▂▁▁
X969.543037542197 5259 0.71 1.52 0.62 0.09 1.05 1.41 1.92 4.73 ▃▇▃▁▁
X982.518846084674 13895 0.24 1.56 0.64 0.08 1.07 1.47 1.97 3.96 ▂▇▆▂▁
X988.547508384061 11392 0.38 1.09 0.36 0.09 0.82 1.04 1.32 2.82 ▁▇▅▁▁
X989.54010791306 12888 0.30 1.04 0.34 0.07 0.78 1.00 1.25 2.41 ▁▇▆▂▁
X1021.56290721623 14408 0.21 1.03 0.36 0.17 0.78 0.98 1.25 2.54 ▂▇▅▁▁
X1023.52260855018 9037 0.51 1.26 0.45 0.14 0.94 1.20 1.50 5.08 ▇▇▁▁▁
X1024.54808802905 10395 0.43 1.22 0.66 0.17 0.82 1.08 1.42 7.67 ▇▁▁▁▁
X1025.56691773401 3021 0.83 2.64 1.05 0.14 1.87 2.48 3.29 8.70 ▃▇▂▁▁
X1026.57088472708 12438 0.32 1.80 0.68 0.10 1.28 1.73 2.25 4.71 ▂▇▆▁▁
X1044.14754874948 13875 0.24 3.41 1.44 1.13 2.78 3.27 3.78 30.42 ▇▁▁▁▁
X1045.55388620346 6010 0.67 3.54 2.13 0.34 2.18 2.97 4.29 24.38 ▇▁▁▁▁
X1046.569250586 4931 0.73 5.05 3.13 0.34 2.82 4.31 6.59 25.71 ▇▃▁▁▁
X1047.58953545812 12328 0.33 4.00 1.72 0.17 2.82 3.80 4.96 13.59 ▃▇▂▁▁
X1066.13228395 10253 0.44 4.56 1.91 0.25 3.25 4.44 5.61 23.75 ▇▅▁▁▁
X1067.138669068 13449 0.26 3.04 1.12 0.64 2.34 2.95 3.57 13.21 ▇▅▁▁▁
X1067.57717570472 10623 0.42 6.80 5.58 0.14 2.93 4.87 8.94 45.13 ▇▂▁▁▁
X1068.57935749796 9900 0.46 4.25 3.15 0.27 2.03 3.24 5.52 25.00 ▇▂▁▁▁
X1081.58520829662 6992 0.62 4.90 4.43 0.12 1.76 3.22 6.72 29.84 ▇▂▁▁▁
X1082.11020311348 10184 0.44 2.47 1.09 0.32 1.71 2.40 3.02 13.34 ▇▃▁▁▁
X1082.59678591629 6539 0.64 3.52 2.62 0.15 1.64 2.68 4.62 18.81 ▇▃▁▁▁
X1083.11177342918 14341 0.22 1.57 0.66 0.25 1.19 1.52 1.84 6.91 ▇▆▁▁▁
X1083.59374890608 11292 0.38 1.84 0.91 0.15 1.22 1.68 2.24 7.81 ▇▇▁▁▁
X1099.62441990754 6355 0.65 5.21 6.16 0.17 1.60 2.93 6.44 73.29 ▇▁▁▁▁
X1100.61311177393 5431 0.70 9.38 11.61 0.24 2.28 4.76 11.74 121.79 ▇▁▁▁▁
X1101.61429665547 6473 0.65 4.78 5.61 0.10 1.31 2.52 6.03 56.85 ▇▁▁▁▁
X1131.62829179685 4453 0.76 6.13 6.23 0.16 1.50 3.69 8.78 39.99 ▇▂▁▁▁
X1132.6208985625 5516 0.70 4.17 3.84 0.16 1.33 2.74 5.85 24.80 ▇▂▁▁▁
X1133.62168504174 9127 0.50 2.11 1.36 0.10 1.10 1.70 2.78 9.04 ▇▅▂▁▁
X1153.61458557877 5009 0.73 5.02 4.36 0.09 1.65 3.34 7.37 26.51 ▇▂▂▁▁
X1154.61607341723 6682 0.63 3.60 2.71 0.09 1.48 2.68 5.19 16.53 ▇▃▂▁▁
X1155.62078684515 8020 0.56 1.55 0.90 0.14 0.84 1.31 2.07 5.56 ▇▆▂▁▁
X1169.60900885242 10565 0.42 3.52 2.17 0.13 1.74 3.12 4.99 15.09 ▇▆▂▁▁
X1170.60905838657 13564 0.26 2.24 1.35 0.16 1.10 2.00 3.18 6.85 ▇▆▅▂▁
X1171.61680180233 12116 0.34 1.12 0.49 0.10 0.74 1.03 1.42 3.11 ▃▇▅▁▁
X1198.7489884275 816 0.96 2.02 0.77 0.18 1.46 1.90 2.46 6.91 ▃▇▂▁▁
X1199.74616711023 3130 0.83 1.60 0.58 0.17 1.18 1.51 1.94 5.20 ▃▇▂▁▁
X1200.72216225828 5657 0.69 0.86 0.31 0.07 0.64 0.82 1.05 2.37 ▂▇▃▁▁
X1221.67497921762 14554 0.20 0.71 0.23 0.13 0.54 0.68 0.85 1.59 ▂▇▆▂▁
X1237.70248667576 13669 0.25 0.86 0.28 0.13 0.65 0.83 1.04 2.09 ▂▇▅▁▁
X1241.6702211025 13459 0.26 0.70 0.25 0.11 0.52 0.68 0.87 2.05 ▃▇▃▁▁
X1242.67907139247 9262 0.49 0.67 0.23 0.12 0.49 0.64 0.82 1.72 ▂▇▅▁▁
X1255.66639225835 14347 0.22 0.73 0.25 0.13 0.54 0.70 0.90 1.75 ▂▇▅▁▁
X1268.68566515798 13416 0.27 0.63 0.21 0.09 0.47 0.60 0.76 1.83 ▂▇▂▁▁
X1287.7198274242 13036 0.29 0.96 0.54 0.09 0.58 0.83 1.18 4.21 ▇▅▁▁▁
X1300.69101239099 10185 0.44 0.60 0.20 0.07 0.45 0.57 0.72 1.59 ▂▇▃▁▁
X1318.7055011406 12921 0.29 0.84 0.29 0.14 0.62 0.81 1.02 2.18 ▂▇▅▁▁
X1336.69651172215 2459 0.87 4.73 5.63 0.13 1.02 2.33 6.35 43.04 ▇▁▁▁▁
X1337.6996697515 6373 0.65 3.85 4.01 0.14 1.08 2.23 5.25 28.97 ▇▂▁▁▁
X1338.69026337081 6703 0.63 2.19 1.78 0.11 0.97 1.58 2.79 13.57 ▇▂▁▁▁
X1352.69068322102 2087 0.89 5.59 6.53 0.09 1.17 2.85 7.55 45.04 ▇▂▁▁▁
X1353.70217721775 9101 0.50 2.81 3.58 0.09 0.82 1.53 3.15 30.61 ▇▁▁▁▁
X1354.69509872675 11358 0.38 1.84 1.47 0.13 0.94 1.39 2.17 14.70 ▇▁▁▁▁
X1359.71920064559 13132 0.28 1.21 0.61 0.11 0.74 1.10 1.57 4.21 ▇▇▃▁▁
X1374.68868544969 5609 0.69 1.71 1.30 0.11 0.75 1.24 2.34 9.91 ▇▂▁▁▁
X1375.70367765744 9157 0.50 1.34 0.87 0.11 0.68 1.05 1.77 7.28 ▇▃▁▁▁
X1376.70057890047 9575 0.48 0.96 0.48 0.10 0.59 0.87 1.24 3.60 ▇▇▂▁▁
X1391.7310555169 13995 0.23 0.68 0.28 0.11 0.48 0.63 0.81 2.28 ▆▇▂▁▁
X1393.72199322531 13802 0.25 0.65 0.23 0.08 0.48 0.62 0.80 1.57 ▂▇▅▂▁
X1396.732964339 14244 0.22 0.72 0.26 0.13 0.52 0.70 0.88 1.96 ▃▇▃▁▁
X1397.74684785689 13563 0.26 0.61 0.21 0.09 0.45 0.60 0.75 1.56 ▂▇▅▁▁
X1445.76205363255 13373 0.27 0.50 0.18 0.09 0.38 0.48 0.59 2.11 ▇▆▁▁▁
X1446.76977484297 13444 0.26 0.53 0.18 0.07 0.40 0.51 0.64 1.83 ▅▇▁▁▁
X1460.77544240007 13099 0.28 2.98 2.06 0.06 1.20 2.67 4.34 13.50 ▇▆▂▁▁
X1462.78675966734 14211 0.22 1.31 0.71 0.06 0.74 1.20 1.74 4.52 ▇▇▃▁▁
X1467.80248378337 13794 0.25 0.59 0.25 0.07 0.40 0.55 0.73 1.74 ▅▇▃▁▁
X1468.79545331291 13959 0.24 0.50 0.21 0.06 0.34 0.45 0.62 1.72 ▆▇▂▁▁
X1482.7828636732 10174 0.44 1.62 0.98 0.09 0.86 1.39 2.20 6.98 ▇▅▂▁▁
X1483.7602040513 10438 0.43 1.10 0.65 0.11 0.61 0.92 1.43 4.91 ▇▅▁▁▁
X1490.79921206715 14061 0.23 0.45 0.16 0.05 0.33 0.42 0.54 1.39 ▃▇▂▁▁
X1505.81814466814 11371 0.38 0.54 0.20 0.08 0.40 0.52 0.67 1.59 ▃▇▃▁▁
X1506.82927677072 12647 0.31 0.69 0.28 0.08 0.47 0.66 0.88 1.94 ▃▇▅▁▁
X1520.76684236019 10519 0.42 0.66 0.30 0.09 0.43 0.61 0.85 2.15 ▆▇▃▁▁
X1548.79708685808 12093 0.34 0.51 0.17 0.09 0.39 0.49 0.61 1.29 ▂▇▃▁▁
X1549.80251865168 13769 0.25 0.45 0.15 0.07 0.34 0.43 0.54 1.05 ▁▇▅▁▁
X1567.81408356388 12086 0.34 1.53 1.47 0.09 0.60 1.05 1.91 20.13 ▇▁▁▁▁
X1570.82757902878 13451 0.26 0.85 0.51 0.10 0.48 0.72 1.08 4.18 ▇▃▁▁▁
X1587.83475034977 14150 0.23 0.40 0.14 0.08 0.30 0.38 0.49 1.38 ▆▇▂▁▁
X1588.84025634597 13937 0.24 0.38 0.13 0.06 0.29 0.37 0.47 0.93 ▂▇▅▁▁
X1617.8705321642 14048 0.23 0.40 0.13 0.06 0.30 0.38 0.48 1.22 ▃▇▂▁▁
X1628.84115733793 13328 0.27 0.47 0.18 0.08 0.34 0.44 0.57 1.52 ▅▇▂▁▁
X1629.84658199452 14273 0.22 0.42 0.16 0.07 0.31 0.39 0.51 1.53 ▇▇▂▁▁
X1643.86831200385 14268 0.22 0.36 0.12 0.08 0.27 0.34 0.43 1.03 ▃▇▂▁▁
X1676.89152948382 13388 0.27 0.34 0.11 0.07 0.26 0.32 0.40 0.94 ▃▇▂▁▁
X1677.8916781126 13927 0.24 0.31 0.11 0.03 0.23 0.29 0.37 0.79 ▁▇▅▁▁
X1708.89992075846 13692 0.25 0.43 0.17 0.09 0.31 0.40 0.52 1.43 ▆▇▂▁▁
X1724.90873224483 14075 0.23 0.70 0.43 0.06 0.39 0.57 0.90 3.52 ▇▃▁▁▁
X1725.90758432642 14565 0.20 0.50 0.31 0.07 0.29 0.41 0.63 2.84 ▇▂▁▁▁
X1754.93218998318 14471 0.21 0.35 0.14 0.05 0.24 0.33 0.42 1.04 ▃▇▂▁▁
X1812.96278802887 14533 0.21 0.28 0.10 0.06 0.21 0.26 0.33 0.84 ▃▇▂▁▁
X1907.98379041504 14566 0.20 0.18 0.06 0.02 0.14 0.18 0.22 0.47 ▂▇▅▁▁
X2032.06680023288 13888 0.24 0.18 0.06 0.03 0.14 0.17 0.21 0.45 ▂▇▅▁▁
X2033.10347402711 13164 0.28 0.18 0.07 0.01 0.14 0.18 0.22 0.57 ▂▇▂▁▁
X2055.07059438912 13944 0.24 0.16 0.05 0.02 0.12 0.15 0.19 0.41 ▂▇▅▁▁
X2106.08527885176 13936 0.24 0.14 0.05 0.00 0.10 0.13 0.17 0.37 ▂▇▅▁▁
X2128.13844472021 14591 0.20 0.16 0.06 0.01 0.12 0.15 0.19 0.39 ▂▇▅▁▁
X2138.11220461466 10229 0.44 0.16 0.06 0.01 0.12 0.15 0.20 0.54 ▃▇▂▁▁
X2166.14516832135 12139 0.34 0.23 0.15 0.01 0.13 0.19 0.30 1.20 ▇▃▁▁▁
X2400.27542672266 12868 0.30 0.11 0.05 0.00 0.07 0.10 0.14 0.41 ▅▇▂▁▁
X2401.27062857394 14294 0.22 0.10 0.04 0.00 0.07 0.10 0.13 0.30 ▃▇▃▁▁
X2403.28670488561 13332 0.27 0.16 0.09 0.01 0.10 0.14 0.20 0.74 ▇▅▁▁▁
X2404.29130915879 14226 0.22 0.16 0.09 0.01 0.10 0.14 0.20 0.78 ▇▅▁▁▁

we observe that the missingness is not uniform across m/z: for some m/z values only 20-30% of the pixels have a value, and the observed intensities there tend to be small. In this dataset this is especially true for the large m/z values
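
To make this pattern explicit, the missing rate can be tabulated per m/z column. The sketch below uses a hypothetical toy data frame whose column names mimic the `X<m/z>` pattern of P; the values are made up for illustration only:

```r
# Toy stand-in for P: column names follow the "X<m/z>" pattern seen above
toy <- data.frame(
  X703.40  = c(1.8, NA, 1.5, 2.0),
  X1336.70 = c(4.1, 2.3, NA, NA),
  X2404.29 = c(NA, NA, NA, 0.2)
)

# Share of pixels with no value, per m/z column
miss_rate <- colMeans(is.na(toy))

# Recover the numeric m/z from the column name by dropping the leading "X"
mz <- as.numeric(sub("^X", "", names(toy)))

# One row per m/z, ordered by m/z, to see where missingness concentrates
miss_by_mz <- data.frame(mz = mz, miss_rate = unname(miss_rate))
miss_by_mz <- miss_by_mz[order(miss_by_mz$mz), ]
miss_by_mz
```

On the real data, replacing `toy` with `P` and plotting `miss_rate` against `mz` should show whether the missingness really grows with m/z.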

we replace the missing data with 0, since a missing value means the signal for that m/z was below the detection threshold

P0 = P
P0[is.na(P0)] = 0

correlation matrix

cm <- cor(P0)
corrplot(cm, method = "color", tl.pos = 'n')

we can see block-wise correlation between the different m/z values; the highest m/z values seem to be uncorrelated with everything else
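
The blocks can be made easier to read by reordering the correlation matrix with hierarchical clustering. This is a sketch on simulated data (the `toy` data frame and its two latent signals are assumptions made for illustration), using only base R:

```r
set.seed(42)
n <- 200
# Two latent signals generate two blocks of correlated columns
f1 <- rnorm(n)
f2 <- rnorm(n)
toy <- data.frame(
  a1 = f1 + rnorm(n, sd = 0.3),
  b1 = f2 + rnorm(n, sd = 0.3),
  a2 = f1 + rnorm(n, sd = 0.3),
  b2 = f2 + rnorm(n, sd = 0.3)
)

cm_toy <- cor(toy)

# Hierarchical clustering on the dissimilarity 1 - correlation
hc  <- hclust(as.dist(1 - cm_toy), method = "average")
ord <- hc$order

# Reordered matrix: columns of the same block are now adjacent
cm_ordered <- cm_toy[ord, ord]
round(cm_ordered, 2)

# Block membership of each column when the tree is cut into 2 groups
blocks <- cutree(hc, k = 2)
blocks
```

For the plot itself, `corrplot()` also accepts `order = "hclust"`, which should apply a comparable reordering to `cm` directly.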

preliminary plots

pixels = read.table("/Users/macbookpro/Documents/Bayesian Statistics/Project/Raw_data/Peptidi/154 variabili/101_peptidi-PreProcessed-XYCoordinates-Step1-Step2-Step4-Step5-101.txt")
colnames(P0) = substr(colnames(P0),1,5)
colnames(pixels) = c("x","y")
max_n_of_pixel = read.table("/Users/macbookpro/Documents/Bayesian Statistics/Project/Raw_data/Peptidi/154 variabili/101_peptidi-PreProcessed-maxXY-Step1-Step2-Step4-Step5-101.txt")
Data_long            = as_tibble(data.frame( pixels, P0 ))
max_number_of_pixels = apply(Data_long[,1:2],2,max)

# one x-by-y slice per m/z column
Data_array = array(NA,c(max_number_of_pixels[1],max_number_of_pixels[2],ncol(P0)))

# there must be a better way to do this
for(k in 1:ncol(P0)){
  for(i in 1:nrow(Data_long)){
  Data_array[Data_long$x[i],Data_long$y[i],k] = P0[i,k]
  }
}

dim(Data_array)
##   x   y     
## 157 178 154
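
One possible faster alternative to the double loop above is matrix indexing, where each row of an index matrix gives one (x, y, k) position. The sketch below runs on hypothetical toy objects (`toy_pixels`, `toy_P0`) and checks the result against the explicit loop:

```r
# Toy stand-ins: 3 pixels on a 2 x 3 grid, 2 m/z columns (hypothetical data)
toy_pixels <- data.frame(x = c(1, 2, 2), y = c(1, 1, 3))
toy_P0     <- data.frame(m1 = c(0.5, 0.0, 1.2), m2 = c(2.0, 3.1, 0.0))

n_pix <- nrow(toy_pixels)
n_mz  <- ncol(toy_P0)
arr   <- array(NA_real_, c(max(toy_pixels$x), max(toy_pixels$y), n_mz))

# One (x, y, k) row per value, in the column-major order of toy_P0
idx <- cbind(
  rep(toy_pixels$x, times = n_mz),
  rep(toy_pixels$y, times = n_mz),
  rep(seq_len(n_mz), each = n_pix)
)

# Matrix indexing assigns all values in a single vectorised step
arr[idx] <- as.matrix(toy_P0)

# Reference result from the explicit double loop, for comparison
arr_loop <- array(NA_real_, dim(arr))
for (k in seq_len(n_mz)) {
  for (i in seq_len(n_pix)) {
    arr_loop[toy_pixels$x[i], toy_pixels$y[i], k] <- toy_P0[i, k]
  }
}
identical(arr, arr_loop)
```

Applied to the real objects, `toy_pixels` would be `Data_long[, c("x", "y")]` and `toy_P0` would be `P0`; grid cells without a pixel stay `NA` in both versions.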
Data_very_long = reshape2::melt(Data_long,c("x","y")) %>% mutate(pixel_ind = paste0(x,"_",y), value_ind = rep(1:nrow(Data_long),ncol(P0)))
Data_very_long = Data_very_long %>% group_by(pixel_ind) %>% mutate(n = row_number()) %>% ungroup() %>% mutate(mz = as.numeric(substr(variable,2,5)))

# subsampling to get a faster plot and not drain memory
sub_ind = sample(unique(Data_very_long$pixel_ind),1000)
# just to get the gist:
ggplot(Data_very_long %>% filter(pixel_ind %in% sub_ind))+
  geom_path(aes(x = mz, y = value, 
                col=pixel_ind, 
                group = pixel_ind),alpha=.5)+theme_bw()+theme(legend.position = "none")+xlab("m.z")+scale_color_viridis_d(option = "A")+
  scale_x_continuous(n.breaks = 20)

mz_values <-  colnames(P0)

investigating the different peaks

peaks around 700

here we can see the first peaks that show a distinctive shape

peak around 750

this shows the same pattern as before

peak at 800

still the same pattern; the peak is at 811

peak around 840

we have the same spots as in the glycans

peaks around 840

we still have these spots and some edge activations, possible biological meaning?

peaks around 860

same patterns

peak around 870

peak around 900

complementary to the main pattern, not too high of spikes

peak around 940

same patterns

same patterns

peak around 1040

we have some spots that seem to be outliers, and the rest is just the same structure

same pattern

spike around 1100

same pattern

the rest is just noise and low values of the same structure

A comprehensive look at the peaks

PCA

pca = princomp(P0)
plot(pca)

summary(pca)
## Importance of components:
##                            Comp.1     Comp.2     Comp.3     Comp.4     Comp.5
## Standard deviation     40.3991506 15.9743645 8.13712111 7.29852391 5.84229587
## Proportion of Variance  0.7289469  0.1139721 0.02957284 0.02379148 0.01524469
## Cumulative Proportion   0.7289469  0.8429190 0.87249188 0.89628337 0.91152806
##                           Comp.6     Comp.7     Comp.8      Comp.9     Comp.10
## Standard deviation     5.3612942 5.06501681 4.99080393 3.258097082 2.915194794
## Proportion of Variance 0.0128378 0.01145812 0.01112481 0.004741104 0.003795654
## Cumulative Proportion  0.9243659 0.93582397 0.94694878 0.951689885 0.955485539
##                            Comp.11     Comp.12     Comp.13     Comp.14
## Standard deviation     2.733224687 2.672351245 2.369943632 2.276792634
## Proportion of Variance 0.003336584 0.003189617 0.002508577 0.002315253
## Cumulative Proportion  0.958822123 0.962011740 0.964520317 0.966835570
##                            Comp.15    Comp.16     Comp.17     Comp.18
## Standard deviation     2.045413443 1.94957808 1.917887220 1.760352181
## Proportion of Variance 0.001868588 0.00169759 0.001642849 0.001384046
## Cumulative Proportion  0.968704158 0.97040175 0.972044597 0.973428643
##                            Comp.19     Comp.20    Comp.21     Comp.22
## Standard deviation     1.735462414 1.604566753 1.51780903 1.512897379
## Proportion of Variance 0.001345185 0.001149918 0.00102893 0.001022281
## Cumulative Proportion  0.974773827 0.975923746 0.97695268 0.977974957
##                             Comp.23      Comp.24      Comp.25      Comp.26
## Standard deviation     1.4779005303 1.4369422782 1.3526720350 1.3392595307
## Proportion of Variance 0.0009755327 0.0009222106 0.0008172153 0.0008010893
## Cumulative Proportion  0.9789504894 0.9798727000 0.9806899152 0.9814910045
##                             Comp.27      Comp.28      Comp.29      Comp.30
## Standard deviation     1.3301421399 1.3006730276 1.2638180367 1.2450017975
## Proportion of Variance 0.0007902191 0.0007555926 0.0007133794 0.0006922953
## Cumulative Proportion  0.9822812237 0.9830368163 0.9837501957 0.9844424910
##                             Comp.31      Comp.32      Comp.33      Comp.34
## Standard deviation     1.2196069006 1.1878905928 1.1674621780 1.1523563644
## Proportion of Variance 0.0006643412 0.0006302376 0.0006087473 0.0005930961
## Cumulative Proportion  0.9851068322 0.9857370698 0.9863458172 0.9869389132
##                             Comp.35      Comp.36      Comp.37      Comp.38
## Standard deviation     1.1096610004 1.0699072836 1.0542070594 1.0063699005
## Proportion of Variance 0.0005499612 0.0005112623 0.0004963674 0.0004523418
## Cumulative Proportion  0.9874888745 0.9880001367 0.9884965042 0.9889488460
##                             Comp.39      Comp.40      Comp.41      Comp.42
## Standard deviation     0.9853022068 0.9773389411 0.9610894117 0.9146349955
## Proportion of Variance 0.0004336011 0.0004266206 0.0004125523 0.0003736346
## Cumulative Proportion  0.9893824471 0.9898090677 0.9902216200 0.9905952546
##                             Comp.43      Comp.44      Comp.45      Comp.46
## Standard deviation     0.8894101221 0.8665821721 0.8556113679 0.8100176747
## Proportion of Variance 0.0003533097 0.0003354061 0.0003269675 0.0002930491
## Cumulative Proportion  0.9909485643 0.9912839704 0.9916109378 0.9919039870
##                             Comp.47      Comp.48      Comp.49      Comp.50
## Standard deviation     0.7816382686 0.7679075614 0.7618670460 0.7391997202
## Proportion of Variance 0.0002728746 0.0002633718 0.0002592447 0.0002440479
## Cumulative Proportion  0.9921768615 0.9924402334 0.9926994780 0.9929435259
##                             Comp.51      Comp.52      Comp.53     Comp.54
## Standard deviation     0.7294960438 0.7029509446 0.7015565015 0.685351777
## Proportion of Variance 0.0002376826 0.0002206996 0.0002198248 0.000209787
## Cumulative Proportion  0.9931812085 0.9934019080 0.9936217329 0.993831520
##                             Comp.55      Comp.56     Comp.57      Comp.58
## Standard deviation     0.6787912683 0.6682097030 0.648940398 0.6453566954
## Proportion of Variance 0.0002057898 0.0001994238 0.000188088 0.0001860163
## Cumulative Proportion  0.9940373097 0.9942367335 0.994424821 0.9946108378
##                             Comp.59      Comp.60      Comp.61      Comp.62
## Standard deviation     0.6336153480 0.6240019021 0.6072965336 0.6044566256
## Proportion of Variance 0.0001793093 0.0001739095 0.0001647225 0.0001631856
## Cumulative Proportion  0.9947901471 0.9949640566 0.9951287791 0.9952919647
##                             Comp.63      Comp.64      Comp.65      Comp.66
## Standard deviation     0.5992822507 0.5930071286 0.5862482479 0.5667055393
## Proportion of Variance 0.0001604037 0.0001570621 0.0001535022 0.0001434387
## Cumulative Proportion  0.9954523684 0.9956094304 0.9957629326 0.9959063713
##                             Comp.67      Comp.68      Comp.69      Comp.70
## Standard deviation     0.5565288927 0.5518703336 0.5466714688 0.5414292616
## Proportion of Variance 0.0001383333 0.0001360271 0.0001334763 0.0001309287
## Cumulative Proportion  0.9960447047 0.9961807318 0.9963142081 0.9964451368
##                             Comp.71     Comp.72      Comp.73      Comp.74
## Standard deviation     0.5245472376 0.519162362 0.5166308171 0.5086563119
## Proportion of Variance 0.0001228912 0.000120381 0.0001192098 0.0001155581
## Cumulative Proportion  0.9965680280 0.996688409 0.9968076188 0.9969231769
##                             Comp.75      Comp.76      Comp.77      Comp.78
## Standard deviation     0.5071022703 0.4945808853 0.4899254174 0.4877170331
## Proportion of Variance 0.0001148531 0.0001092512 0.0001072041 0.0001062398
## Cumulative Proportion  0.9970380299 0.9971472811 0.9972544852 0.9973607250
##                             Comp.79      Comp.80      Comp.81      Comp.82
## Standard deviation     0.4819257010 0.4744127957 0.4666743708 4.617112e-01
## Proportion of Variance 0.0001037317 0.0001005227 0.0000972701 9.521211e-05
## Cumulative Proportion  0.9974644567 0.9975649794 0.9976622495 9.977575e-01
##                             Comp.83      Comp.84      Comp.85      Comp.86
## Standard deviation     4.498147e-01 4.399089e-01 4.377205e-01 4.342643e-01
## Proportion of Variance 9.036884e-05 8.643247e-05 8.557465e-05 8.422862e-05
## Cumulative Proportion  9.978478e-01 9.979343e-01 9.980198e-01 9.981041e-01
##                             Comp.87      Comp.88      Comp.89      Comp.90
## Standard deviation     4.297765e-01 4.285246e-01 0.4239960143 4.070508e-01
## Proportion of Variance 8.249673e-05 8.201681e-05 0.0000802925 7.400287e-05
## Cumulative Proportion  9.981866e-01 9.982686e-01 0.9983488723 9.984229e-01
##                             Comp.91      Comp.92      Comp.93      Comp.94
## Standard deviation     4.052982e-01 3.962946e-01 3.900365e-01 3.785081e-01
## Proportion of Variance 7.336702e-05 7.014357e-05 6.794569e-05 6.398849e-05
## Cumulative Proportion  9.984962e-01 9.985664e-01 9.986343e-01 9.986983e-01
##                             Comp.95      Comp.96      Comp.97      Comp.98
## Standard deviation     3.731613e-01 3.334123e-01 3.272569e-01 0.3236794842
## Proportion of Variance 6.219346e-05 4.964947e-05 4.783316e-05 0.0000467931
## Cumulative Proportion  9.987605e-01 9.988102e-01 9.988580e-01 0.9989047891
##                             Comp.99     Comp.100     Comp.101     Comp.102
## Standard deviation     0.3169676141 3.145463e-01 3.077433e-01 3.072566e-01
## Proportion of Variance 0.0000448726 4.418966e-05 4.229885e-05 4.216517e-05
## Cumulative Proportion  0.9989496617 9.989939e-01 9.990362e-01 9.990783e-01
##                            Comp.103     Comp.104     Comp.105     Comp.106
## Standard deviation     3.031456e-01 3.005097e-01 2.978941e-01 2.935509e-01
## Proportion of Variance 4.104442e-05 4.033374e-05 3.963466e-05 3.848737e-05
## Cumulative Proportion  9.991194e-01 9.991597e-01 9.991993e-01 9.992378e-01
##                            Comp.107     Comp.108     Comp.109     Comp.110
## Standard deviation     2.887736e-01 2.877916e-01 2.843650e-01 0.2818193309
## Proportion of Variance 3.724487e-05 3.699199e-05 3.611635e-05 0.0000354726
## Cumulative Proportion  9.992751e-01 9.993121e-01 9.993482e-01 0.9993836414
##                            Comp.111     Comp.112     Comp.113     Comp.114
## Standard deviation     2.746143e-01 0.2711894496 0.2627414775 2.555900e-01
## Proportion of Variance 3.368198e-05 0.0000328471 0.0000308325 2.917691e-05
## Cumulative Proportion  9.994173e-01 0.9994501704 0.9994810029 9.995102e-01
##                            Comp.115     Comp.116     Comp.117     Comp.118
## Standard deviation     2.506924e-01 2.460048e-01 2.425202e-01 2.379811e-01
## Proportion of Variance 2.806944e-05 2.702953e-05 2.626922e-05 2.529509e-05
## Cumulative Proportion  9.995382e-01 9.995653e-01 9.995915e-01 9.996168e-01
##                            Comp.119     Comp.120     Comp.121     Comp.122
## Standard deviation     2.318558e-01 2.263664e-01 2.197474e-01 2.179916e-01
## Proportion of Variance 2.400973e-05 2.288629e-05 2.156747e-05 2.122418e-05
## Cumulative Proportion  9.996409e-01 9.996637e-01 9.996853e-01 9.997065e-01
##                            Comp.123     Comp.124     Comp.125     Comp.126
## Standard deviation     2.153974e-01 2.116471e-01 2.097317e-01 2.064405e-01
## Proportion of Variance 2.072204e-05 2.000673e-05 1.964624e-05 1.903449e-05
## Cumulative Proportion  9.997273e-01 9.997473e-01 9.997669e-01 9.997859e-01
##                            Comp.127     Comp.128     Comp.129     Comp.130
## Standard deviation     2.028313e-01 0.2000042102 0.1836679365 1.790292e-01
## Proportion of Variance 1.837474e-05 0.0000178661 0.0000150667 1.431525e-05
## Cumulative Proportion  9.998043e-01 0.9998221811 0.9998372478 9.998516e-01
##                            Comp.131     Comp.132     Comp.133     Comp.134
## Standard deviation     1.739702e-01 1.711473e-01 1.687172e-01 1.667773e-01
## Proportion of Variance 1.351764e-05 1.308253e-05 1.271364e-05 1.242297e-05
## Cumulative Proportion  9.998651e-01 9.998782e-01 9.998909e-01 9.999033e-01
##                            Comp.135     Comp.136     Comp.137     Comp.138
## Standard deviation     1.641567e-01 1.531942e-01 1.490205e-01 1.473419e-01
## Proportion of Variance 1.203563e-05 1.048181e-05 9.918439e-06 9.696248e-06
## Cumulative Proportion  9.999153e-01 9.999258e-01 9.999357e-01 9.999454e-01
##                            Comp.139     Comp.140     Comp.141     Comp.142
## Standard deviation     1.377268e-01 1.342370e-01 1.096922e-01 1.087324e-01
## Proportion of Variance 8.472046e-06 8.048144e-06 5.374063e-06 5.280429e-06
## Cumulative Proportion  9.999539e-01 9.999620e-01 9.999673e-01 9.999726e-01
##                            Comp.143     Comp.144     Comp.145     Comp.146
## Standard deviation     1.068803e-01 8.395213e-02 8.043323e-02 7.652334e-02
## Proportion of Variance 5.102076e-06 3.147856e-06 2.889498e-06 2.615407e-06
## Cumulative Proportion  9.999777e-01 9.999809e-01 9.999837e-01 9.999864e-01
##                            Comp.147     Comp.148     Comp.149     Comp.150
## Standard deviation     7.579695e-02 7.097042e-02 6.949090e-02 6.823466e-02
## Proportion of Variance 2.565990e-06 2.249605e-06 2.156787e-06 2.079513e-06
## Cumulative Proportion  9.999889e-01 9.999912e-01 9.999933e-01 9.999954e-01
##                            Comp.151     Comp.152     Comp.153     Comp.154
## Standard deviation     5.967253e-02 5.619498e-02 4.468675e-02 3.941605e-02
## Proportion of Variance 1.590378e-06 1.410414e-06 8.918851e-07 6.939011e-07
## Cumulative Proportion  9.999970e-01 9.999984e-01 9.999993e-01 1.000000e+00

the PCA works well: the first 6 components explain about 92% of the variance
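
The cumulative share of variance can be read off programmatically instead of scanning the summary table. A minimal sketch on simulated data (the `toy` data frame and the 90% threshold are assumptions chosen for illustration):

```r
set.seed(7)
# Toy data: 100 observations, 5 variables with very different variances
toy <- as.data.frame(
  sapply(c(10, 5, 2, 1, 0.5), function(s) rnorm(100, sd = s))
)

pca_toy <- princomp(toy)

# Variance explained by each component and its running total
var_explained <- pca_toy$sdev^2 / sum(pca_toy$sdev^2)
cum_explained <- cumsum(var_explained)

# Smallest number of components reaching a 90% threshold
n_comp <- which(cum_explained >= 0.90)[1]
n_comp
```

On the fit above, `cumsum(pca$sdev^2) / sum(pca$sdev^2)` should reproduce the "Cumulative Proportion" row of `summary(pca)`, for example 0.924 at Comp.6.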

PCA1 = ggplot(Data_long %>% mutate(pca1 = pca$scores[,1]))+ theme_bw()+
  geom_tile(aes(x=x,y=y,fill = pca1))+scale_fill_viridis_c(option = "A",na.value = "red")+
  theme_void()+theme(legend.position = "bottom")
PCA2 = ggplot(Data_long %>% mutate(pca2 = pca$scores[,2]))+ theme_bw()+
  geom_tile(aes(x=x,y=y,fill = pca2))+scale_fill_viridis_c(option = "A",na.value = "red")+
  theme_void()+theme(legend.position = "bottom")
PCA3 = ggplot(Data_long %>% mutate(pca3 = pca$scores[,3]))+ theme_bw()+
  geom_tile(aes(x=x,y=y,fill = pca3))+scale_fill_viridis_c(option = "A",na.value = "red")+
  theme_void()+theme(legend.position = "bottom")
PCA4 = ggplot(Data_long %>% mutate(pca4 = pca$scores[,4]))+ theme_bw()+
  geom_tile(aes(x=x,y=y,fill = pca4))+scale_fill_viridis_c(option = "A",na.value = "red")+
  theme_void()+theme(legend.position = "bottom")
PCA5 = ggplot(Data_long %>% mutate(pca5 = pca$scores[,5]))+ theme_bw()+
  geom_tile(aes(x=x,y=y,fill = pca5))+scale_fill_viridis_c(option = "A",na.value = "red")+
  theme_void()+theme(legend.position = "bottom")
PCA6 = ggplot(Data_long %>% mutate(pca6 = pca$scores[,6]))+ theme_bw()+
  geom_tile(aes(x=x,y=y,fill = pca6))+scale_fill_viridis_c(option = "A",na.value = "red")+
  theme_void()+theme(legend.position = "bottom")


PCA1+PCA2+PCA3+PCA4+PCA5+PCA6

we can clearly see the main patterns in the data
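
Since the six score maps differ only in the component index, they could be generated by one small helper. The function name `plot_score_map` and the toy inputs below are hypothetical; the sketch assumes `ggplot2` is installed:

```r
library(ggplot2)

# Hypothetical helper: one tile map of a per-pixel value on the (x, y) grid
plot_score_map <- function(coords, values, label) {
  df <- data.frame(x = coords$x, y = coords$y, value = values)
  ggplot(df) +
    geom_tile(aes(x = x, y = y, fill = value)) +
    scale_fill_viridis_c(option = "A", na.value = "red", name = label) +
    theme_void() +
    theme(legend.position = "bottom")
}

# Toy stand-ins for the pixel coordinates and the PCA scores
set.seed(3)
toy_coords <- expand.grid(x = 1:10, y = 1:8)
toy_pca    <- princomp(matrix(rnorm(80 * 4), ncol = 4))

# One map per component; the score column is always selected as [, k]
score_maps <- lapply(1:3, function(k) {
  plot_score_map(toy_coords, toy_pca$scores[, k], paste0("pca", k))
})
length(score_maps)
```

With the real objects the call would look like `plot_score_map(Data_long, pca$scores[, k], paste0("pca", k))`, and the resulting list could then be combined into one figure, for example with `patchwork::wrap_plots()`.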

comparing the pca scores with the data

PCA1 = ggplot(Data_long %>% mutate(pca1 = pca$scores[,1]))+ theme_bw()+
  geom_tile(aes(x=x,y=y,fill = pca1))+scale_fill_viridis_c(option = "A",na.value = "red")+
  theme_void()+theme(legend.position = "bottom")

PCA3 = ggplot(Data_long %>% mutate(pca3 = pca$scores[,3]))+ theme_bw()+
  geom_tile(aes(x=x,y=y,fill = pca3))+scale_fill_viridis_c(option = "A",na.value = "red")+
  theme_void()+theme(legend.position = "bottom")

PCA5 = ggplot(Data_long %>% mutate(pca5 = pca$scores[,5]))+ theme_bw()+
  geom_tile(aes(x=x,y=y,fill = pca5))+scale_fill_viridis_c(option = "A",na.value = "red")+
  theme_void()+theme(legend.position = "bottom")
PCA6 = ggplot(Data_long %>% mutate(pca6 = pca$scores[,6]))+ theme_bw()+
  geom_tile(aes(x=x,y=y,fill = pca6))+scale_fill_viridis_c(option = "A",na.value = "red")+
  theme_void()+theme(legend.position = "bottom")
P1 = ggplot(Data_long)+ theme_bw()+
  geom_tile(aes(x=x,y=y,fill = X705.))+scale_fill_viridis_c(option = "A",na.value = "red")+
  theme_void()+theme(legend.position = "bottom")
P2 = ggplot(Data_long)+ theme_bw()+
  geom_tile(aes(x=x,y=y,fill = X859.))+scale_fill_viridis_c(option = "A",na.value = "red")+
  theme_void()+theme(legend.position = "bottom")


PCA1+PCA3+PCA5+PCA6+P1+P2

corresponding component

these are the components of the main shape

the second PCA score corresponds to X842

PCA2 = ggplot(Data_long %>% mutate(pca2 = pca$scores[,2]))+ theme_bw()+
  geom_tile(aes(x=x,y=y,fill = pca2))+scale_fill_viridis_c(option = "A",na.value = "red")+
  theme_void()+theme(legend.position = "bottom")
P2 = ggplot(Data_long)+ theme_bw()+
  geom_tile(aes(x=x,y=y,fill = X842.))+scale_fill_viridis_c(option = "A",na.value = "red")+
  theme_void()+theme(legend.position = "bottom")

PCA2+P2

corresponding component

as expected, it is essentially just the 842 peak

should we do something about this?
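
One possible answer, offered only as a sketch and not as a decision for this analysis: `princomp()` works on the covariance matrix by default, so a variable with a much larger scale can take over a component. Inspecting the loadings shows which variable dominates, and `cor = TRUE` switches to the correlation matrix. The toy data below are simulated, and here the dominated component is the first one rather than the second:

```r
set.seed(11)
n <- 300
# Toy data: one variable with a much larger scale than the others
shared <- rnorm(n)
toy <- data.frame(
  big    = rnorm(n, sd = 20),
  small1 = shared + rnorm(n, sd = 0.5),
  small2 = shared + rnorm(n, sd = 0.5),
  small3 = shared + rnorm(n, sd = 0.5)
)

# Covariance-based PCA (the default): the large-scale variable dominates
pca_cov <- princomp(toy)
top_cov <- names(which.max(abs(pca_cov$loadings[, 1])))

# Correlation-based PCA: every variable is rescaled to unit variance first
pca_cor <- princomp(toy, cor = TRUE)
top_cor <- names(which.max(abs(pca_cor$loadings[, 1])))

c(covariance = top_cov, correlation = top_cor)
```

On the real fit, `sort(abs(pca$loadings[, 2]), decreasing = TRUE)[1:5]` would list the variables that drive the second component. Whether rescaling is appropriate depends on whether the absolute intensity of the 842 peak is itself of interest.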